Identifying ions from mass spectral data

ABSTRACT

A method for identifying isotope patterns in mass spectral data, comprising obtaining a desired mass spectral peak shape function; obtaining mass spectral data composed of actual isotope patterns to be analyzed; calculating a theoretical isotope pattern from the known elemental composition of at least one basic ion whose isotope pattern is representative of the ions to be analyzed, by using the mass spectral peak shape function; comparing quantitatively corresponding parts of the theoretical isotope pattern to those of the mass spectral data; calculating a numerical metric to measure similarity between the theoretical isotope pattern and the actually measured isotope pattern; and utilizing the numerical metric as an indication of the possible presence of ions whose isotope patterns resemble that of the basic ion. A computer for, and a computer readable medium having computer readable code thereon for, performing the methods. A mass spectrometer having an associated computer for performing the methods.

This application claims priority, under 35 U.S.C. §119(e), from provisional patent applications Ser. No. 60/941,656 filed on Jun. 2, 2007 and 60/956,692 filed on Aug. 18, 2007. The entire contents of these applications are incorporated herein, in their entireties.

CROSS REFERENCE TO RELATED PATENT APPLICATIONS/PATENTS

The entire contents of the following documents are incorporated herein by reference in their entireties:

U.S. Pat. No. 6,983,213; International Patent Application PCT/US2004/013096, filed on Apr. 28, 2004; U.S. patent application Ser. No. 11/261,440, filed on Oct. 28, 2005; International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005; International Patent Application PCT/US2006/013723, filed on Apr. 11, 2006; U.S. patent application Ser. No. 11/754,305, filed on May 27, 2007; International Patent Application PCT/US2007/069832, filed on May 28, 2007; and U.S. patent application Ser. No. 11/830,772, which was filed on Jul. 30, 2007 and which claims priority from provisional patent application Ser. No. 60/833,862 filed on Jul. 29, 2006.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to mass spectrometry systems. More particularly, it relates to mass spectrometry systems that are useful for the analysis of complex mixtures of molecules, including large and small organic molecules such as proteins or peptides, environmental pollutants, pharmaceuticals and their metabolites, and petrochemical compounds, to methods of analysis used therein, and to a computer program product having computer code embodied therein for causing a computer, or a computer and a mass spectrometer in combination, to effect such analysis.

2. Prior Art

In drug metabolism studies, researchers typically create a radio-labeled version of the parent drug before dosing the drug in animal or human test subjects. Through biotransformations, the drug will be transformed into its metabolites, ranging from just a few to as many as 50-70 metabolites. By detecting and following the radioactivity, researchers can trace these biotransformations and account for the metabolites. The sample is typically injected into an LC/MS system for analysis, where various metabolites are separated in (retention) time and detected by mass spectrometry. While these metabolites can be traced by a radioactivity detector in a split flow arrangement in parallel to mass spectrometry, the identification of these metabolites will ultimately have to rely on mass spectrometry due to its mass (m/z) measuring capability. Unfortunately in many cases, the biological sample, even after extensive clean-up, sample preparation, and LC separation, still suffers from significant matrix or background ion interferences, making metabolite identification a time-consuming and tedious process. To help with the mass spectral identification of possible metabolites, researchers may dose test subjects with a mixture of the native and radio-labeled compound, creating a unique mass spectral signature that is easier for researchers to spot in a mass spectrum. Subject to limitations on total dosage, radioactivity exposure for a given test species, mass spectral saturation, and the uncertainty surrounding the ratio between the native and the radio-labeled version of the drug, metabolite identification remains a daunting task for researchers, even with the aid of radioactivity tracing.

After an ion has been identified to be possibly drug-related, it is typically required to then confirm its elemental composition before structural elucidation through further MS/MS experimentation, or even isolation for NMR analysis. Due to the various backgrounds present, typically, higher resolution mass spectrometry is desired in order to avoid interference from the matrix or background ions. Higher resolution mass spectrometry systems such as TOF, qTOF, Orbi-Trap, or FT ICR MS offer two distinct advantages: fewer spectral interferences and higher mass accuracy. Even with elaborate calibration schemes such as lock mass, dual spray, and internal calibration, obtaining unique elemental composition remains a challenge at the extremely high mass accuracy of 100 ppb.

A previous approach, as in U.S. Pat. No. 6,983,213 and International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005, provides a novel method for calibrating mass spectra for improved mass accuracy and line shape correction to improve the ability to perform elemental composition analysis or formula identification.

Very high mass accuracy can be obtained on so-called unit mass resolution systems in accordance with the techniques taught in U.S. Pat. No. 6,983,213.

Accurate line shape calibration provides a highly reliable metric to assist in unambiguous formula identification by matching the measured spectra to calculated candidate formulas, as in International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005.

However, obtaining unique elemental composition from conventional to high resolution mass spectrometry systems remains a challenge to practitioners of mass spectrometry.

Thus, there exists a significant gap between what current mass spectral systems can offer and what is being achieved at present using existing technologies for mass spectral analysis.

SUMMARY OF THE INVENTION

It is an object of the invention to provide a mass spectrometry system and a method for operating a mass spectrometry system that overcome the disadvantages described above, in accordance with the methods described herein.

It is another object of the invention to provide a storage medium having thereon computer readable program code for causing a mass spectrometry system to perform the method in accordance with the invention.

An additional aspect of the invention is, in general, a computer readable medium having thereon computer readable code for use with a mass spectrometer system having a data analysis portion including a computer, the computer readable code being for causing the computer to analyze data by performing the methods described herein. The computer readable medium preferably further comprises computer readable code for causing the computer to perform at least one of the specific methods described.

Of particular significance, the invention is also directed generally to a mass spectrometer system for analyzing chemical composition, the system including a mass spectrometer portion and a data analysis system, the data analysis system operating by obtaining calibrated continuum spectral data by processing raw spectral data, generally in accordance with the methods described herein. The data analysis portion may be configured to operate in accordance with the specifics of these methods. Preferably the mass spectrometer system further comprises a sample preparation portion for preparing samples to be analyzed, and a sample separation portion for performing an initial separation of samples to be analyzed. The separation portion may comprise at least one of an electrophoresis apparatus, a chemical affinity chip, or a chromatograph for separating the sample into various components.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing aspects and other features of the present invention are explained in the following description, taken in connection with the accompanying drawings, wherein:

FIG. 1 is a block diagram of a mass spectrometer in accordance with the invention.

FIG. 2 is a flow chart of the steps in the identification of isotopically similar ions used by the system of FIG. 1.

FIG. 3A to FIG. 3F are graphical representations of some results obtained during the process of FIG. 2.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring to FIG. 1, there is shown a block diagram of an analysis system 10, that may be used to analyze proteins or other molecules, as noted above, incorporating features of the present invention. Although the present invention will be described with reference to the single embodiment shown in the drawings, it should be understood that the present invention can be embodied in many alternate forms of embodiments. In addition, any suitable types of components could be used.

Analysis system 10 has a sample preparation portion 12, other detector portion 23, a mass spectrometer portion 14, a data analysis system 16, and a computer system 18. The sample preparation portion 12 may include a sample introduction unit 20, of the type that introduces a sample containing proteins, peptides, or small molecule drug of interest to system 10, such as Finnigan LCQ Deca XP Max, manufactured by Thermo Electron Corporation of Waltham, Mass., USA. The sample preparation portion 12 may also include an analyte separation unit 22, which is used to perform a preliminary separation of analytes, such as the proteins to be analyzed by system 10. Analyte separation unit 22 may be any one of a chromatography column or an electrophoresis separation unit, such as a gel-based separation unit manufactured by Bio-Rad Laboratories, Inc. of Hercules, Calif., both of which are well known in the art. In general, a voltage is applied to the unit to cause the proteins to be separated as a function of one or more variables, such as migration speed through a capillary tube, isoelectric focusing point (Hannesh, S. M., Electrophoresis 21, 1202-1209 (2000)), or by mass (one dimensional separation), or by more than one of these variables, such as by isoelectric focusing and by mass. An example of the latter is known as two-dimensional electrophoresis.

The mass spectrometer portion 14 may be a conventional mass spectrometer and may be any one available, but is preferably one of MALDI-TOF, quadrupole MS, ion trap MS, qTOF, TOF/TOF, or FTMS. If it has a MALDI or electrospray ionization ion source, such ion source may also provide for sample input to the mass spectrometer portion 14. In general, mass spectrometer portion 14 may include an ion source 24, a mass analyzer 26 for separating ions generated by ion source 24 by mass to charge ratio, an ion detector portion 28 for detecting the ions from mass analyzer 26, and a vacuum system 30 for maintaining a sufficient vacuum for mass spectrometer portion 14 to operate efficiently. If mass spectrometer portion 14 is an ion mobility spectrometer, generally no vacuum system is needed and the data generated are typically called a plasmagram instead of a mass spectrum.

In parallel to the mass spectrometer portion 14, there may be other detector portion 23, where a portion of the flow is diverted, for nearly parallel detection of the sample in a split flow arrangement. This other detector portion 23 may be a single channel UV detector, a multi-channel UV spectrometer, a Refractive Index (RI) detector, a light scattering detector, a radioactivity monitor (RAM), etc. RAM is most widely used in drug metabolism research for ¹⁴C-labeled experiments where the various metabolites can be traced in near real time and correlated to the mass spectral scans.

The data analysis system 16 includes a data acquisition portion 32, which may include one or a series of analog to digital converters (not shown) for converting signals from ion detector portion 28 into digital data. This digital data is provided to a real time data processing portion 34, which processes the digital data through operations such as summing and/or averaging. A post-processing portion 36 may be used to do additional processing of the data from real time data processing portion 34, including library searches, data storage and data reporting.

Computer system 18 provides control of sample preparation portion 12, mass spectrometer portion 14, other detector portion 23, and data analysis system 16, in the manner described below. Computer system 18 may have a conventional computer monitor or display 40 to allow for the entry of data on appropriate screen displays, and for the display of the results of the analyses performed. Computer system 18 may be based on any appropriate personal computer, operating for example with a Windows® or UNIX® operating system, or any other appropriate operating system. Computer system 18 will typically have a hard drive 42, on which the operating system and the program for performing the data analysis described below is stored. A drive 44 for accepting a CD or floppy disk is used to load the program in accordance with the invention on to computer system 18. The program for controlling sample preparation portion 12 and mass spectrometer portion 14 will typically be downloaded as firmware for these portions of system 10. Data analysis system 16 may be a program written to implement the processing steps discussed below, in any of several programming languages such as C++, JAVA or Visual Basic.

As mentioned in U.S. Pat. No. 6,983,213, for a given standard ion of known elemental composition, the acquired profile mode mass spectral data y₀ and its theoretical counterpart y are related to each other through

$(g \otimes y_{0}) = (g \otimes y) \otimes p$  Equation 1

where ⊗ represents convolution, g represents a small Gaussian, and p represents the mass spectral peak shape function. When y₀, y, and g are known, the mass spectral peak shape function p can be readily calculated through deconvolution.
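As an illustration only, the deconvolution in Equation 1 can be sketched in the Fourier domain, where convolution becomes multiplication. The sketch below assumes uniformly sampled vectors, circular convolution, and a small damping constant `eps` to stabilize the division; the function names, the Gaussian width, and the synthetic test data are hypothetical and are not taken from U.S. Pat. No. 6,983,213.

```python
import numpy as np

def gaussian(n, sigma):
    """Unit-area Gaussian centred on index 0 and wrapped circularly."""
    x = np.arange(n)
    x = np.minimum(x, n - x)
    g = np.exp(-0.5 * (x / sigma) ** 2)
    return g / g.sum()

def circular_convolve(a, b):
    """Circular convolution of two equal-length real vectors via FFT."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), len(a))

def estimate_peak_shape(y0, y, sigma=2.0, eps=1e-6):
    """Solve (g conv y0) = (g conv y) conv p for p by damped Fourier division."""
    n = len(y0)
    g = gaussian(n, sigma)
    lhs = np.fft.rfft(circular_convolve(g, y0))
    rhs = np.fft.rfft(circular_convolve(g, y))
    # Damped division (an assumption of this sketch) avoids dividing by
    # frequency components that the Gaussian g has suppressed to near zero.
    P = lhs * np.conj(rhs) / (np.abs(rhs) ** 2 + eps)
    return np.fft.irfft(P, n)

# Synthetic check: a three-line theoretical pattern y broadened by a known p.
n = 256
y = np.zeros(n)
y[[100, 110, 120]] = [1.0, 0.3, 0.05]
p_true = gaussian(n, 4.0)
y0 = circular_convolve(y, p_true)
p_est = estimate_peak_shape(y0, y)
print("max abs error:", np.max(np.abs(p_est - p_true)))
```

In this convention the recovered p peaks at index 0, meaning zero mass shift; a real implementation would also need to handle noise, non-uniform sampling, and edge effects.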

When the measured y₀ is a linear combination of two ions at varying relative signal levels, such as the native and radio-labeled version of a small molecule drug, additional parameters need to be introduced, such that:

$y_{0} = c_{1} y_{1,0} + c_{2} y_{2,0}$  Equation 2

$y = c_{1} y_{1} + c_{2} y_{2}$  Equation 3.

As long as the two additional parameters c₁ and c₂ are known or their ratio c₁/c₂ or c₂/c₁ is given, the same approach outlined in U.S. Pat. No. 6,983,213 can be used to arrive at the peak shape function p. When their relative concentrations are not known, as is the case in drug metabolism research, due to incomplete isotope replacement reaction, an iterative approach to arrive at c₁/c₂ and p has been disclosed in International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005 and International Patent Application PCT/US2006/013723, filed on Apr. 11, 2006.

While generally producing excellent results, there are situations in which an iterative approach is not preferred due to at least two considerations: it may be computationally intensive and its convergence is not always guaranteed. For this reason, a more direct, computationally efficient, and reliable approach is disclosed here as a preferred embodiment, in a few distinct steps:

a. Examining the theoretical isotope cluster from the native and the corresponding isotope labeled version, one finds that the two vectors y₁ and y₂ are, for the most part, simply shifted versions of each other. For example, the Verapamil native drug C₂₇H₃₉N₂O₄⁺ and its radio-carbon labeled version ¹⁴CC₂₆H₃₉N₂O₄⁺ are for the most part shifted versions of each other, with the radio-labeled version shifted on the mass axis by 14.00324 − 12.00000 = +2.00324 Da. Based on this observation, one could proceed by setting c₂ (or c₁) to zero and c₁ (or c₂) to one (in Equations 2 and 3) and perform the deconvolution in Equation 1 to calculate a peak shape function p, which would contain the true peak shape function duplicated twice with 2.00324 Da spacing in between. While one could perform peak detection and peak selection to select one of the two peaks as the peak shape function and jump to step c, a preferred approach will be described next, which is more generally applicable, even at less than unit mass resolution, and with stable ¹³C isotope labeling, where the mass spacing between the two peaks would be less, and reliable separation of the peaks becomes more challenging.

b. Treat the above calculated p as y₀ in Equation 1, create a new y through Equation 3 by deleting all other isotopes except for the monoisotopes from the theoretically calculated isotope distributions y₁ and y₂, and set c₁=1 and c₂=some initial estimate (or vice versa with c₂=1 and c₁=some initial estimate, with no loss of generality in the following descriptions). Note that y₁ and y₂ thus created would be spaced exactly the same mass distance apart as the two peaks in p (new y₀). Typically the initial estimate for c₂ can be easily obtained from the sample preparation process. Applying another round of deconvolution based on Equation 1 creates a new peak shape p, which now contains primarily one single peak shape function. To the extent that c₂ is in error, there will still be a small peak in the deconvoluted p, either negative or positive, depending on the sign of the error in c₂. This small extra peak can be zeroed out to arrive at a cleaned-up version of p.

c. Treat c₁ and c₂ from Equation 3 as unknowns, insert Equation 3 into Equation 1 and simplify to arrive at

$y_{0} = c_{1}(y_{1} \otimes p) + c_{2}(y_{2} \otimes p)$  Equation 4

which can now be solved to obtain updated values for c₁ and c₂ using the same y₀, y₁, and y₂ from step b above and the cleaned-up version of p, through, for example, least squares linear regression.

d. Repeat steps a-c above as necessary using the updated concentrations c₁ and c₂. Typically only one more calculation through step a results in a true peak shape function p with the complication from interfering ion(s) completely removed.
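The regression in step c (Equation 4) can be illustrated with a minimal sketch. It assumes uniformly sampled vectors, a peak shape p stored as an odd-length vector centered on its middle element, and ordinary (unweighted) least squares; the function name `fit_concentrations` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_concentrations(y0, y1, y2, p):
    """Least squares estimate of c1 and c2 in y0 = c1*(y1 conv p) + c2*(y2 conv p)."""
    A = np.column_stack([
        np.convolve(y1, p, mode="same"),  # broadened native pattern
        np.convolve(y2, p, mode="same"),  # broadened labeled pattern
    ])
    coef, *_ = np.linalg.lstsq(A, y0, rcond=None)
    return coef

# Synthetic example: the labeled pattern is the native one shifted by 20 samples.
n = 200
x = np.arange(-20, 21)
p = np.exp(-0.5 * (x / 3.0) ** 2)          # assumed cleaned-up peak shape
y1 = np.zeros(n)
y1[[80, 90]] = [1.0, 0.3]                  # native stick pattern
y2 = np.roll(y1, 20)                       # labeled stick pattern
y0 = 1.0 * np.convolve(y1, p, mode="same") + 0.6 * np.convolve(y2, p, mode="same")

c = fit_concentrations(y0, y1, y2, p)
print(c)  # estimates of c1 and c2
```

With measured data the fit would not be exact, and a weighted regression may be preferable to account for signal variance.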

It should be noted that with the monoisotopic peak from the native ion from a higher resolution system, where the monoisotopic peak is baseline resolved from other isotopes, the true peak shape function p can be directly obtained without iteration, and the relative concentrations c₁ and c₂ can be obtained from the above Equation 4 in a single step. Once the true peak shape function p is obtained, one may proceed with the mass spectral calibration as referenced in U.S. Pat. No. 6,983,213 to calibrate for the mass axis while also transforming the peak shape into a desired or target peak shape function that is mathematically definable. Alternatively, but less desirably, one could leave the raw mass spectral data as is, except that the peak shape function is now known and numerically represented by p. This completes Step 230 in FIG. 2.

One can now move to the next stage, Step 240 in FIG. 2, to construct an ion pattern to be searched in the rest of the mass spectral data for the possible presence of similar or “resembling” ions that would also show similar isotope patterns. This is useful for researchers in the drug metabolism area where a parent drug along with its unique isotope pattern gives rise to various metabolites exhibiting similarly unique isotope patterns. The unique isotope pattern may come from the parent drug itself due to the presence of Br or Cl elements in its elemental composition, or from the mixing of the native drug with its isotope labeled version. If the metabolites contain the same number of Br or Cl elements as the drug itself and the drug metabolism pathway is indifferent with respect to certain isotopes (which is the case in most applications), similar isotope patterns will be observed for the metabolites. Since various metabolites come at different masses and chromatographic retention times, it is difficult and time consuming to spot these isotope patterns in a typical LC/MS run that generates a data matrix on the order of 4000 time points by 8000 mass points in the presence of matrix and background ions typical of biological samples.

This similarity in isotope patterns among the parent drug and its various metabolites will now be exploited for an automatic algorithm to identify the possible presence of these resembling ions without actually knowing their precise elemental compositions. Once the peak shape function p has been obtained along with the concentration ratio c₁/c₂ between the two basic ions (those of known chemical composition, such as, for example, a parent drug, the isotope labeled version of the parent drug, a known fragment of the parent drug or its isotope labeled version, a known metabolite or its fragment, and the isotope labeled version of the known metabolite or its fragment from drug metabolism studies; e.g. those of a composition that is known or has already been determined), a mass spectral isotope pattern t can be established by

$t = (c_{1} y_{1} + c_{2} y_{2}) \otimes p$  Equation 5

where y₁ and y₂ are theoretically calculated from the elemental compositions of the basic ion and its isotope labeled version, respectively (Step 240 in FIG. 2). This isotope pattern t can be used to fit to a segment of mass spectral data r through the following model

$r = Kc + e$  Equation 6

where r is an (n×1) matrix of the profile mode mass spectral data, digitized at n m/z values; c is a (k×1) matrix of regression coefficients which are representative of the concentrations of k components in matrix K; K is an (n×k) matrix composed of profile mode mass spectral responses for the k components, all sampled at the same n m/z points as r; and e is an (n×1) matrix of a fitting residual with contributions from random noise and any systematic deviations from this model. The k columns of the matrix K will contain the isotope pattern t (for example, in its first column, without the loss of generality, for easy subsequent description) and any background or baseline components, which may or may not vary with mass (as additional columns). A least squares solution to Equation 6 leads to

$\hat{c} = K^{+} r$  Equation 7

where K⁺ (dimensioned as k×n) is the pseudo inverse of the matrix K, a process well established in matrix algebra, as referenced in U.S. Pat. No. 6,983,213; International Patent Application PCT/US2004/013096, filed on Apr. 28, 2004; U.S. patent application Ser. No. 11/261,440, filed on Oct. 28, 2005; International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005; and International Patent Application PCT/US2006/013723, filed on Apr. 11, 2006.
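By way of example, the model of Equation 6 and the least squares solution K⁺r can be sketched as follows. The choice of baseline columns (a constant offset and a linear slope) is an assumption made for illustration; the function name `fit_segment` and the synthetic pattern are hypothetical.

```python
import numpy as np

def fit_segment(r, t):
    """Fit r = K c + e, with K holding the isotope pattern t plus two baseline columns."""
    n = len(t)
    slope = np.linspace(-1.0, 1.0, n)
    # Column 0: isotope pattern t; column 1: flat offset; column 2: linear slope.
    K = np.column_stack([t, np.ones(n), slope])
    K_pinv = np.linalg.pinv(K)   # k x n pseudo inverse; each row acts as a digital filter
    c_hat = K_pinv @ r
    return c_hat, K_pinv

# Synthetic doublet pattern, scaled by 3 and placed on a sloping baseline.
x = np.arange(41)
t = np.exp(-0.5 * ((x - 10) / 2.0) ** 2) + 0.6 * np.exp(-0.5 * ((x - 30) / 2.0) ** 2)
r = 3.0 * t + 0.5 + 0.1 * np.linspace(-1.0, 1.0, 41)

c_hat, K_pinv = fit_segment(r, t)
print(c_hat)  # coefficient of t, then offset, then slope
```

The first element of the result is the estimated contribution of the isotope pattern t; the remaining elements absorb the baseline so that it does not bias that estimate.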

Note that in Equation 7, each row in K⁺ serves as a digital filter applied to the mass spectral segment r to arrive at a concentration vector c containing the contribution of each component, including the ion isotope pattern t and any other components included in matrix K. These digital filters in K⁺ can be calculated once in a limited mass spectral range and then applied to a mass spectral segment in an extended mass range in a sliding window, much like a convolution filter, in Step 240 in FIG. 2, to generate a concentration vector c from Equation 7 for each retention time and mass location combination. In the special case where there is just one column involved in matrix K, i.e., no baseline or background involved besides the isotope pattern t itself, it can be proved that the digital filter is of the same form as the isotope pattern t itself, subject to a scaling factor. A residual can be calculated for each such retention time and mass location combination as

$\hat{e} = r - K\hat{c}$  Equation 8

in Steps 250 and 260 in FIG. 2, where ĉ is the least squares estimate of c from Equation 7 and ê is the resulting residual vector. This residual vector can be further reduced into a scalar by taking the 2-norm, i.e., the square root of the sum of squares of all elements involved, or root mean square error, and converted into a relative residual error as

$e = \|\hat{e}\|_{2} / \|r\|_{2}$

While one can relate this residual error directly to the likelihood for the presence of a resembling ion, it may be more convenient intuitively to convert this residual error into a numeric metric that increases when the measured isotope pattern more closely resembles the given isotope pattern t given in Equation 5. This numeric metric may be equal to the t-statistic or one minus the p-value as disclosed in U.S. Pat. No. 6,983,213 and U.S. patent application Ser. No. 11/754,305, filed on May 27, 2007, corresponding to International Patent Application PCT/US2007/069832, filed on May 28, 2007, or some other appropriate function of the residual error. This corresponds to Step 280 in FIG. 2.
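The sliding-window use of the filters in K⁺, together with the residual and relative residual error described above, can be sketched as below for a single spectrum. The sketch assumes a single-column K (pattern only, no baseline), uniform sampling, and the convention that an all-zero window is assigned a relative error of 1; the function name `scan_spectrum` and the synthetic spectrum are hypothetical.

```python
import numpy as np

def scan_spectrum(spectrum, K):
    """Apply K+ in a sliding window; return pattern coefficient and relative residual."""
    n = K.shape[0]
    K_pinv = np.linalg.pinv(K)
    windows = np.lib.stride_tricks.sliding_window_view(spectrum, n)  # (m-n+1, n)
    c_hat = windows @ K_pinv.T          # one row of coefficients per window
    resid = windows - c_hat @ K.T       # residual r - K c_hat for every window
    r_norm = np.linalg.norm(windows, axis=1)
    rel = np.ones_like(r_norm)          # assumption: empty windows count as no match
    nonzero = r_norm > 0
    rel[nonzero] = np.linalg.norm(resid[nonzero], axis=1) / r_norm[nonzero]
    return c_hat[:, 0], rel

# Pattern t: a doublet. The spectrum holds 5*t at offset 100 and an unrelated
# single peak at offset 210.
x = np.arange(41)
single = np.exp(-0.5 * ((x - 10) / 2.0) ** 2)
t = single + 0.6 * np.exp(-0.5 * ((x - 30) / 2.0) ** 2)
spectrum = np.zeros(300)
spectrum[100:141] += 5.0 * t
spectrum[210:251] += single

K = t[:, None]
coef, rel = scan_spectrum(spectrum, K)
best = int(np.argmin(rel))
print(best, coef[best], rel[best], rel[210])
```

In the single-column case the filter row of K⁺ equals t divided by the sum of squares of t, consistent with the statement that the filter has the same form as t up to a scaling factor.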

FIG. 3A shows the average of a few mass spectral scans from a retention time window corresponding to a faint radioactivity monitor (RAM) signal and FIG. 3B shows a corresponding resemblance weight factor calculated as:

$w_{i} = {r_{i}2^{- \frac{e_{i}}{a}}}$

where the subscript i refers to mass spectral data point i, r_i and e_i are the mass spectral raw signal and residual corresponding to mass spectral data point i based on the above calculations using a mass spectral segment centered around mass spectral data point i, and a is a user-settable parameter that takes on the form of:

a=0.15, for e_(i)<0.15 or 15% relative residual error

a=0.05, for e_(i)≧0.15 or 15% relative residual error
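The weight factor and its two-level parameter a can be written as a short function. This is a sketch of the formula as stated above, with the thresholds hard-coded to the example values 0.15 and 0.05; the function name `resemblance_weight` is hypothetical.

```python
import numpy as np

def resemblance_weight(r, e):
    """w_i = r_i * 2**(-e_i / a), with a = 0.15 if e_i < 0.15, else a = 0.05."""
    r = np.asarray(r, dtype=float)
    e = np.asarray(e, dtype=float)
    a = np.where(e < 0.15, 0.15, 0.05)
    return r * np.exp2(-e / a)

# Same raw signal, increasing relative residual error.
w = resemblance_weight([100.0, 100.0, 100.0], [0.0, 0.075, 0.15])
print(w)
```

A perfect fit (e = 0) leaves the raw signal unchanged, while the switch to the smaller a at 15% relative error makes the weight fall off sharply: at e = 0.15 the signal is reduced by a factor of about 8.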

Comparing the zoomed-in versions of FIGS. 3A and 3B shown in FIGS. 3C and 3D, respectively, it is clear which mass spectral region contains ions of high resemblance to the basic ions and one may proceed further in identifying the elemental compositions for these high likelihood ions, using the approach outlined in International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005, and International Patent Application PCT/US2006/013723, filed on Apr. 11, 2006.

These high likelihood ions and their elemental compositions are reported out by computer 18 (FIG. 1) by being displayed on the monitor 40 and/or by printing on a printer (not shown) associated with computer 18.

Since all weights across the mass spectrum can be summed up into a total weight and plotted out as a function of chromatographic retention time (FIG. 3F), a time-dependent data trace very similar in nature to a RAM data trace (shown in FIG. 3E) can be generated, where each peak indicates a retention time point where possible high resemblance ions exist. It should be pointed out, however, that RAM is only applicable to cases involving radio-labeled compound, whereas this novel approach does not require radioactivity for detection, and may actually be more sensitive due to the typically higher sensitivity available through mass spectrometry detection. Due to the lack of RAM sensitivity, it is sometimes required, in in vitro experiments, to work with 100% radio-labeled compound without mixing in the native compound, causing possible cell or enzyme deaths and making mass spectral identification of drug-related ions difficult without the unique isotope pattern from a 50%:50% mixture. With this new approach, one can reduce the amount of radioactivity exposure to test subjects or even eliminate it completely by working with stable isotopes such as ¹³C labeling instead. On the other hand, in the presence of good quality RAM signal, the above calculations can be significantly sped up by focusing only on the retention time region where there is a rise in RAM signal.
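The total weight trace amounts to a row sum over the matrix of weights. A minimal sketch, assuming the weights are stored with one row per retention time point and one column per m/z point (the layout, the function name `total_weight_trace`, and the small example matrix are hypothetical):

```python
import numpy as np

def total_weight_trace(weights):
    """Sum resemblance weights over the m/z axis; one value per retention time point."""
    return np.asarray(weights, dtype=float).sum(axis=1)

# Three retention time points by three m/z points.
weights = np.array([
    [0.0, 0.1, 0.0],
    [2.0, 5.0, 1.0],
    [0.2, 0.0, 0.1],
])
trace = total_weight_trace(weights)
print(trace, int(np.argmax(trace)))
```

Peaks in this trace play the role that peaks in a RAM trace would play, pointing to retention times worth examining further.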

For reasons discussed in U.S. Pat. No. 6,983,213; International Patent Application PCT/US2004/013096, filed on Apr. 28, 2004; U.S. patent application Ser. No. 11/261,440, filed on Oct. 28, 2005; International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005; International Patent Application PCT/US2006/013723, filed on Apr. 11, 2006; U.S. patent application Ser. No. 11/754,305, filed on May 27, 2007; and International Patent Application PCT/US2007/069832, filed on May 28, 2007, it is preferred to carry out all of the above calculations using the profile mode mass spectral data and have the raw profile mode data calibrated for both mass and peak shape. The above calculations can, however, be carried out in centroid mode, with or without peak shape calibration, with inferior results. In this case, the peak shape function described in this application becomes a delta function with just one non-zero element in the entire peak shape vector.

While the description above uses a pair of two ions as basic ions for easy discussion, the same approach applies to cases involving 3 or more ions. For example, when there are 2 ¹⁴C replacements with incomplete reaction, it is possible to have a mixture as a linear combination of native, one ¹⁴C labeled, and two ¹⁴C labeled ions. The identical process and algorithm can be utilized for such multiple ¹⁴C labeling experiments by simply augmenting the relevant matrices including K, c, K⁺, and adding y₃. Although there appear to be three concentration elements in this case, there are actually only two independent concentration elements due to the closure rule:

$c_{3} = 1 - c_{1} - c_{2}$

which can be utilized to reduce the number of unknowns estimated and improve the numerical and statistical stability of the calculations. As a special case, when there is only one ion involved as the basic ion for the metabolism study of a Br- or Cl-containing drug, all of the above calculations and algorithms still apply, except that there are no concentration estimate steps b, c, or d.
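One way to apply the closure rule, sketched here under the assumption that the measured segment r has been scaled so that the three concentrations sum to one, is to substitute c₃ = 1 − c₁ − c₂ into the three-component model and regress on the two remaining unknowns. The function name `fit_with_closure` and the synthetic patterns are hypothetical.

```python
import numpy as np

def fit_with_closure(r, t1, t2, t3):
    """Estimate (c1, c2, c3) with c3 = 1 - c1 - c2 substituted into r = c1*t1 + c2*t2 + c3*t3."""
    # After substitution: r - t3 = c1*(t1 - t3) + c2*(t2 - t3)
    A = np.column_stack([t1 - t3, t2 - t3])
    (c1, c2), *_ = np.linalg.lstsq(A, r - t3, rcond=None)
    return np.array([c1, c2, 1.0 - c1 - c2])

def peak(x, center, sigma=3.0):
    """Gaussian stand-in for a peak-shape-broadened isotope pattern."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

x = np.arange(120)
t1, t2, t3 = peak(x, 40), peak(x, 60), peak(x, 80)   # native, one label, two labels
r = 0.5 * t1 + 0.3 * t2 + 0.2 * t3

c = fit_with_closure(r, t1, t2, t3)
print(c)  # estimates of c1, c2, c3
```

Only two coefficients are estimated from the data; the third follows from the constraint, which is what reduces the number of unknowns.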

For all the analysis described above, it may be advantageous to transform the m/z axis into another more appropriate axis beforehand, to allow for analysis with a uniform peak shape function in the transformed axis, as pointed out in U.S. Pat. No. 6,983,213 and International Patent Application PCT/US2004/034618 filed on Oct. 20, 2004.

The process described above includes a fairly comprehensive series of steps, for purposes of illustration, and to be complete. However, there are many ways in which the process may be varied, including leaving out certain steps, or performing certain steps beforehand or “off-line”. For example, it is possible to follow all the above approaches by including disjoining isotope segments (segments that are not continuous with respect to one another, but have spaces between them in the spectrum), especially with data measured from higher resolution MS systems, so as to avoid the mass spectrally separated interference peaks that are located within, but are not directly overlapped with, the isotope cluster of an ion of interest. Furthermore, one may wish to include only the isotopic peaks that are not overlapped with interferences in the above analysis, using exactly the same vector or matrix algebra during the quantitative comparison Step 250 in FIG. 2. If the disjoining isotope segments pose a mathematical difficulty in terms of derivative calculations, one may consider zero-filling the left out regions in the isotope cluster before the relevant calculations or leaving out the regions with interferences after the derivative calculations. Lastly, one may wish to perform a weighted regression from Equation 1 to 8 to better account for the signal variance, as referenced in U.S. Pat. No. 6,983,213.

Although matrix operations are used to describe the process including Equations 1 to 8, mathematically equivalent operations such as digital filtering, convolution, deconvolution, correlation, auto-correlation, regression, optimization, and fitting may also be utilized to the same effect, as is well known by one skilled in the art of digital signal processing and numerical analysis.

This invention discloses an approach to calculate or calibrate the actual peak shape function in order to achieve the best possible results. One may bypass this actual peak shape function and instead simply assume a peak shape function to proceed with the ion isotope pattern identification, with somewhat inferior results.

It is noted that the terms “mass” and “mass to charge ratio” are used somewhat interchangeably in connection with information or output as defined by the mass to charge ratio axis of a mass spectrometer. This is a common practice in the scientific literature and in scientific discussions, and no ambiguity will occur, when the terms are read in context, by one skilled in the art.

The methods of analysis of the present invention can be realized in hardware, software, or a combination of hardware and software. Any kind of computer system, or other apparatus adapted for carrying out the methods and/or functions described herein, is suitable. A typical combination of hardware and software could be a general purpose computer system with a computer program that, when loaded and executed, controls the computer system, which in turn controls an analysis system, such that the system carries out the methods described herein. The present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which, when loaded in a computer system (which in turn controls an analysis system), is able to carry out these methods.

Computer program means or computer program in the present context include any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after conversion to another language, code or notation, and/or reproduction in a different material form.

Thus the invention includes an article of manufacture, which comprises a computer usable medium having computer readable program code means embodied therein for causing a function described above. The computer readable program code means in the article of manufacture comprises computer readable program code means for causing a computer to effect the steps of a method of this invention. Similarly, the present invention may be implemented as a computer program product comprising a computer usable medium having computer readable program code means embodied therein for causing a function described above. The computer readable program code means in the computer program product comprises computer readable program code means for causing a computer to effect one or more functions of this invention. Furthermore, the present invention may be implemented as a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for causing one or more functions of this invention.

It is noted that the foregoing has outlined some of the more pertinent objects and embodiments of the present invention. The concepts of this invention may be used for many applications. Thus, although the description is made for particular arrangements and methods, the intent and concept of the invention is suitable and applicable to other arrangements and applications. It will be clear to those skilled in the art that other modifications to the disclosed embodiments can be effected without departing from the spirit and scope of the invention. The described embodiments ought to be construed to be merely illustrative of some of the more prominent features and applications of the invention. Thus, it should be understood that the foregoing description is only illustrative of the invention. Various alternatives and modifications can be devised by those skilled in the art without departing from the invention. Other beneficial results can be realized by applying the disclosed invention in a different manner or modifying the invention in ways known to those familiar with the art. Thus, it should be understood that the embodiments have been provided as an example and not as a limitation. Accordingly, the present invention is intended to embrace all alternatives, modifications and variances which fall within the scope of the appended claims.

1. A method for identifying isotope patterns from mass spectral data, comprising: obtaining a desired mass spectral peak shape function; obtaining mass spectral data composed of actual isotope patterns to be analyzed; calculating theoretical isotope pattern from known elemental composition of at least one basic ion whose isotope pattern is representative of the ions to be analyzed, by using mass spectral peak shape function; comparing quantitatively corresponding parts of the theoretical isotope pattern to that of the mass spectral data; calculating a numerical metric to measure similarity between the theoretical isotope pattern and actually measured isotope pattern; and utilizing the numerical metric as an indication for possible presence of ions whose isotope patterns resemble that of the basic ion.
 2. The method of claim 1, wherein the desired peak shape function is one of assumed peak shape function, actual measured peak shape function, actual calculated peak shape function, target peak shape function from a mass spectral calibration involving peak shape, and a delta function if the obtained mass spectral data is centroid data.
 3. The method of claim 1, wherein the actual isotope pattern is measured in profile mode.
 4. The method of claim 3, wherein measurement of similarity between the theoretical isotope pattern and the actually measured isotope pattern is performed at a MS resolution higher than unit mass resolution.
 5. The method of claim 4, wherein the measured isotope pattern is converted to have a desired peak shape function.
 6. The method of claim 5, wherein a desired peak shape function is one of assumed peak shape function, actual peak shape function, measured and calculated peak shape function, and target peak shape function from a mass spectral calibration involving peak shape.
 7. The method of claim 1, wherein the actual isotope pattern is a linear combination of at least two basic ions and/or their fragments.
 8. The method of claim 7, wherein the at least two basic ions and/or their fragments include native and isotope labeled versions of the ion or fragment.
 9. The method of claim 1, wherein the actual isotope pattern is generated by an ion with signature elements, with distinct isotope patterns.
 10. The method of claim 1, wherein the measured mass spectral response is calibrated to have a desired peak shape function.
 11. The method of claim 10, wherein a desired peak shape function is one of assumed peak shape function, actual peak shape function, measured and calculated peak shape function, and target peak shape function from a mass spectral calibration involving peak shape.
 12. The method of claim 1, wherein the theoretical isotope pattern is calculated by convolution of isotope distribution and a desired peak shape function.
 13. The method of claim 12, wherein a desired peak shape function is one of assumed peak shape function, actual peak shape function, measured and calculated peak shape function, and target peak shape function from a mass spectral calibration involving peak shape.
 14. The method of claim 12, wherein the isotope distribution is theoretically calculated from the elemental composition of at least one basic ion.
 15. The method of claim 12, wherein the theoretical isotope pattern is calculated as a linear combination of more than one basic ion with the linear combination coefficients being at least one of a user input and calculated from the actual isotope patterns.
 16. The method of claim 7, wherein the linear combination coefficients of relevant ions are calculated from the actual isotope patterns and the known elemental compositions of these relevant ions.
 17. The method of claim 16, wherein the calculation of the linear combination coefficients of relevant ions involves at least one of convolution, deconvolution, matrix multiplication, matrix inversion, and iteration.
 18. The method of claim 1, wherein the basic ion is at least one of the parent drug, the isotope labeled version of the parent drug, a known fragment of the parent drug or its isotope labeled version, a known metabolite or its fragment, and the isotope labeled version of the known metabolite or its fragment from drug metabolism studies.
 19. The method of claim 1, wherein the ions to be analyzed are metabolites from drug metabolism studies.
 20. The method of claim 1, wherein the quantitative comparison comprises at least one of a digital filtering, matrix multiplication, matrix inversion, convolution, deconvolution, correlation, auto-correlation, regression, and fitting.
 21. The method of claim 20, wherein the quantitative comparison includes at least one of baseline, backgrounds, and other known ions in the same mass spectral range.
 22. The method of claim 1, wherein the numerical metric is derived from residual error.
 23. The method of claim 22, wherein the numerical metric is a weight calculated as a function of the residual error such that a higher weight corresponds to a smaller residual error and hence a higher probability of presence of an ion whose isotope pattern resembles that of the basic ion.
 24. The method of claim 23, wherein the weight can be summed over the entire mass spectral range into a total weight and plotted as a function of retention time in LC/MS or GC/MS analysis.
 25. The method of claim 24, further comprising comparing the total weight when plotted as a function of retention time to the radioactivity trace in radio-labeled drug metabolism studies.
 26. The method of claim 24, further comprising using the total weight when plotted as a function of retention time to replace output data from a radioactivity detector.
 27. The method of claim 1, further comprising determining elemental compositions for ions resembling the basic ions.
 28. The method of claim 27, wherein the ions resembling the basic ions include multiple ions.
 29. The method of claim 28, wherein the multiple resembling ions follow the same linear combination relationship as the basic ions.
 30. The method of claim 29, wherein both the basic ion and the resembling ion each contain the native and the isotope-labeled version of the same ion.
 31. A computer programmed to perform the method of claim 1.
 32. The computer of claim 31, in combination with a mass spectrometer for obtaining mass spectral data to be analyzed by said computer.
 33. A computer readable medium having computer readable code thereon for causing a computer to perform the method of claim 1.
 34. A mass spectrometer having associated therewith a computer for performing data analysis functions of data produced by the mass spectrometer, the computer performing the method of claim 1.